22
UPLOADS
Nov 21, 2017 Ed Summers
texts
eye 18
favorite 0
comment 0
987,938 tweets retrieved that mentioned #PuertoRico over the period of October 4 to November 7, 2017. This was a period where there was increased concern being expressed in social media about the response to the humanitarian crisis caused by Hurricane Maria, which made landfall on September 20. Tweets with ids greater than 919222753353457664 were collected from the streaming API, and the earlier tweets were collected using the search API. In both cases tweets using #PuertoRico were collected.
Topics: puerto rico, hurricane, twitter
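The boundary id in the description above can be turned into a date, because tweet ids of this era are Snowflake ids that carry a millisecond timestamp in their upper bits. The following is a minimal sketch, not part of the dataset: the function names are hypothetical, and it assumes every id in the file is a Snowflake id.

```python
from datetime import datetime, timezone

# Twitter Snowflake ids store milliseconds since this custom epoch
# (2010-11-04 UTC) in the bits above bit 22.
TWITTER_EPOCH_MS = 1288834974657

# Boundary id quoted in the dataset description.
STREAM_BOUNDARY_ID = 919222753353457664


def snowflake_to_datetime(tweet_id: int) -> datetime:
    """Return the UTC creation time encoded in a Snowflake tweet id."""
    ms = (tweet_id >> 22) + TWITTER_EPOCH_MS
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def collection_source(tweet_id: int) -> str:
    """Label an id as 'stream' or 'search' following the description:
    ids greater than the boundary came from the streaming API."""
    return "stream" if tweet_id > STREAM_BOUNDARY_ID else "search"


if __name__ == "__main__":
    print(snowflake_to_datetime(STREAM_BOUNDARY_ID).isoformat())
    print(collection_source(STREAM_BOUNDARY_ID + 1))
```

Decoding the boundary id this way shows roughly when collection switched from the search API to the streaming API.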
Community Media
Oct 17, 2017 Ed Summers
texts
eye 31
favorite 0
comment 0
This dataset contains 17,292,130 tweet ids for tweets collected from the Twitter filter stream API for #blm and #blacklivesmatter between 2016-01-29 and 2017-03-18 using twarc. The files are broken into segments because of network connectivity problems, so there are varying time gaps present between the files. Also, when the hashtags were trending globally, rate limits may have prevented some tweets from being streamed over the API.
Topics: blacklivesmatter, twitter
Community Media
Oct 17, 2017 Ed Summers
texts
eye 58
favorite 2
comment 0
This dataset contains identifiers for 8,410,431 tweets that were collected between September 19, 2017 and October 5, 2017 that mentioned #CatalanReferendum, #CatalalonianReferendum, #Catalonia, #1oct, #1o or #votarem. These hashtags were used in the lead up to the Catalan Independence Referendum on October 1, 2017. The referendum was declared illegal under Spanish law, and the Spanish police attempted to prevent it. The data collection was a collaboration with Vicenç Ruiz Gómez and Aniol...
Topics: politics, twitter, spain, catalan
Community Media
Sep 1, 2017 Ed Summers
texts
eye 36
favorite 0
comment 0
The 2017 solar eclipse occurred on August 21 and was total for Oregon, Idaho, Wyoming, Nebraska, Kansas, Missouri, Illinois, Kentucky, Tennessee, North Carolina, Georgia, and South Carolina. This dataset includes 13,548,321 tweet identifiers for tweets that included any of the keywords solareclipse2017, solareclipse, eclipse2017, eclipseday or eclipse for the period August 17 to August 23, 2017. The hashtags were selected after watching Twitter's streaming API for the trending hashtag...
Topics: Twitter, Eclipse, Astronomy, Social Media
Community Media
Aug 18, 2017 Ed Summers
texts
eye 32
favorite 0
comment 0
The Unite the Right rally (also known as the Charlottesville rally) was a protest in Charlottesville, Virginia, United States from August 11–12, 2017, to oppose the removal of a statue of Robert E. Lee in Emancipation Park, which itself was renamed from Lee Park two months earlier. Protesters included white supremacists, white nationalists, neo-Confederates, neo-Nazis, and militias. This dataset contains 200,113 tweet ids collected with the #unitetheright hashtag. Data collection was...
Topics: Twitter, Charlottesville
Community Media
Aug 9, 2017 Ed Summers
texts
eye 16
favorite 0
comment 0
The uploaded ids.txt.gz contains 32,056 tweets that mention "ferguson" between August 8 and August 10, 2014. They were collected on May 7th, 2015 from the search form on Twitter's website. Two important limitations to be aware of are that the dataset does not include tweets that were deleted before May 7th, 2015, and that retweets are not included.
Topics: twitter, ferguson, blacklivesmatter
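A file such as ids.txt.gz can be checked against the count quoted in its description without unpacking it to disk. This is a minimal sketch that assumes the file holds one tweet id per line, which the description does not state; the function name is hypothetical.

```python
import gzip


def count_tweet_ids(path: str) -> int:
    """Count non-empty lines in a gzip-compressed text file.

    Assumes one tweet id per line (an assumption about the file layout).
    """
    count = 0
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                count += 1
    return count
```

For example, `count_tweet_ids("ids.txt.gz")` would return the number of ids in a downloaded copy, which can then be compared with the figure in the description.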
Community Texts
May 17, 2017
texts
eye 28
favorite 0
comment 0
Identifiers for 782,509 tweets that included the hashtag #macronleaks or #macrongate that were sent between 2017-05-02 07:02:05 and 2017-05-10 16:14:51 UTC. The tweets were collected from the Twitter Search API using twarc. The data does not include the first use of the #macrongate hashtag, but it does include the first use of the #macronleaks hashtag which went viral after Wikileaks retweeted it. More about the story of the #macronleaks hashtag can be found at:...
Topics: Politics, Twitter
Community Texts
Apr 27, 2017
texts
eye 40
favorite 0
comment 0
The tweet-ids.txt.gz file contains 10,159,892 identifiers for tweets and retweets sent by or to J. K. Rowling (jk_rowling) between 2015-07-08 and 2017-03-18. The tweets were collected with Social Feed Manager (m5_003).
Topic: Twitter
Community Media
Mar 28, 2017 Ed Summers
texts
eye 44
favorite 0
comment 0
This bag contains 2,711,011 tweet identifiers collected from the Twitter filter stream between 2017-02-09 and 2017-03-18 that used one or more of the following hashtags: alternativefacts, fakenews, truthiness, postfact, posttruth, factcheck. The original tweets were collected using twarc.
Topics: twitter, journalism
Community Media
Dec 30, 2016 Ed Summers
texts
eye 47
favorite 0
comment 0
These are tweets that were collected between August 27, 2015 and January 4, 2016 that mention the word "trump". This period marked important early months in the Republican primary. They were collected from Twitter's streaming API using twarc. There are 40,202,199 tweet identifiers in all. Due to network outages there are gaps at the following points: - 2015-08-27 19:12:37 - 2015-08-27 20:13:44 - 2015-11-02 02:02:13 - 2015-11-05 16:20:35 - 2015-12-28 02:02:42 - 2015-12-28...
Topics: twitter, politics, trump
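Gaps like the ones listed in the description above can be located by scanning tweet timestamps for unusually long silences. The sketch below is illustrative only: the function name and the 30-minute threshold are assumptions, and it is not the method used to produce the list in the description.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


def find_gaps(
    timestamps: Iterable[datetime],
    threshold: timedelta = timedelta(minutes=30),  # assumed cutoff
) -> List[Tuple[datetime, datetime]]:
    """Return (last_seen, next_seen) pairs where consecutive tweets
    are separated by more than `threshold`."""
    ordered = sorted(timestamps)
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > threshold:
            gaps.append((earlier, later))
    return gaps
```

For a keyword as busy as "trump", a silence of more than a few minutes is already suspicious, so the threshold would need tuning per dataset.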
The Archive Team Just In Time Grabs
Nov 1, 2016 Ed Summers
web
eye 1,407
favorite 0
comment 0
This package includes a wget capture of http://jobs.code4lib.org created by Ed Summers on 2016-10-31. The jobs.warc.gz file contains the WARC output of the crawl, and jobs.code4lib.org.tar.gz is a mirror copy that you can uncompress and mount on the web. The additional file dump.json.gz is a JSON snapshot of the jobs, employers, subjects, locations and users present in the application before it was shut down. See https://github.com/code4lib/shortimer for more information.
Topics: Jobs, Libraries, Programming
Community Software
Aug 4, 2016
data
eye 9
favorite 0
comment 0
PURL domain: /net/ndnp
Topic: purl_data_md
Community Media
Apr 18, 2016 Ed Summers
data
eye 34
favorite 0
comment 0
This collection includes two sets of Twitter identifiers for tweets mentioning panamapapers. panamapapers-search.txt: This file contains 2,880,162 tweet identifiers collected by Ed Summers from the Twitter search API for the keyword 'panamapapers' using twarc v0.6.1. Data collection started at 2016-04-11 18:37:51 UTC and finished at 2016-04-13 11:17:57 UTC. The resulting tweets ranged from 2016-04-11 to 2016-04-03. panamapapers-stream.txt: This file contains 4,815,339 tweet identifiers that...
Topics: Twitter, PanamaPapers
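Because the search and stream files were gathered separately, some tweet ids may appear in both. A minimal sketch for measuring that overlap follows; it assumes each file lists one id per line, and the function names are hypothetical.

```python
from typing import Dict, Iterable, Set


def load_ids(path: str) -> Set[int]:
    """Read a plain-text file with one tweet id per line (assumed layout)."""
    with open(path, encoding="utf-8") as handle:
        return {int(line) for line in handle if line.strip()}


def summarize_overlap(search_ids: Iterable[int], stream_ids: Iterable[int]) -> Dict[str, int]:
    """Count ids unique to each set, shared by both, and in the union."""
    search, stream = set(search_ids), set(stream_ids)
    return {
        "search_only": len(search - stream),
        "stream_only": len(stream - search),
        "both": len(search & stream),
        "total_unique": len(search | stream),
    }
```

With local copies of the two files, `summarize_overlap(load_ids("panamapapers-search.txt"), load_ids("panamapapers-stream.txt"))` would report how many identifiers the two collections share.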
Community Media
Oct 19, 2015 Ed Summers
texts
eye 26
favorite 0
comment 0
On Friday, June 5, 2015, at a pool party in McKinney, Texas, a police officer was video-recorded restraining an unarmed African-American fifteen-year-old girl on the ground. He later drew his handgun during the same incident. This bag contains 180,000 identifiers for tweets containing the hashtag #McKinney that were sent between 20:15:53 and 23:46:26 on June 7, 2015. They were collected by Bergis Jules at the University of California at Riverside in collaboration with the Maryland Institute...
Topics: McKinney, BlackLivesMatter, Twitter
Community Media
Oct 19, 2015 Ed Summers
texts
eye 24
favorite 0
comment 0
The shooting of Samuel DuBose occurred during a traffic stop for a missing front license plate on July 19, 2015, in Cincinnati, Ohio. Ray Tensing, a white University of Cincinnati police officer, fatally shot DuBose, a black man, when DuBose started his car and, according to Tensing, began to drive off. Tensing stated that he was being dragged when his arm became caught in the car. Prosecutors alleged that footage from Tensing's bodycam showed that he was not dragged, and a grand jury indicted...
Topics: Twitter, BlackLivesMatter
Community Media
Jul 13, 2015 Ed Summers
texts
eye 20
favorite 0
comment 0
This package contains 10,406,506 #lovewins tweet identifiers for tweets that were sent in the aftermath of the Supreme Court's decision in Obergefell v. Hodges that was announced on June 26th, 2015. They cover the period of June 22, 2015 at 03:13:55 to July 02 at 09:03:39. The tweet ids were collected starting on June 26, 2015 by doing a search of Twitter with twarc, and also setting up a filter stream capture at the same time, also with twarc. Twitter's Terms of Service only allow the tweet...
Topics: twitter, lovewins, lgbt
Ferguson Tweets
Nov 18, 2014 Ed Summers
data
eye 414
favorite 3
comment 1
This item represents a collection of 13,480,000 tweet IDs that mentioned "ferguson" from 2014-08-10 to 2014-08-27 and 15,080,078 tweet IDs that mention "ferguson" between 2014-11-11 and 2014-12-08. The first set includes tweets for the two-week period after the shooting of Michael Brown, and the second range includes tweets around the grand jury's decision not to indict police officer Darren Wilson, which was announced on 2014-11-24. The first set of tweets were collected by Ed Summers...
( 1 review )
Topics: twitter, ferguson, blacklivesmatter
Community Media
Aug 16, 2014
texts
eye 29
favorite 0
comment 0
Tweets from the Society of American Archivists 2013 meeting in New Orleans.
Topics: Twitter, Society of American Archivists
Community Media
Jun 30, 2014 U.S. Army Military District of Washington
data
eye 97
favorite 0
comment 0
These zip files were obtained on June 30, 2014 from: https://www.rmda.army.mil/foia/FOIA_ReadingRoom/Detail.aspx?id=92 The FOIA request was submitted by Alexa O'Brien on January 9th, 2014.
Topics: Bradley Manning, Chelsea Manning, Wikileaks, FOIA, US Army, Transcripts
The Aaron Swartz Collection
Jan 17, 2013 Ed Summers
texts
eye 236
favorite 1
comment 0
This package contains Twitter JSON data for two Twitter search queries that were collected in the week following Aaron's death, using twarc, which talks to Twitter's search API: "Aaron Swartz" OR aaronsw and #pdftribute. aaronsw.json.gz contains 630,397 tweets, for the period starting with 2013-01-11 16:50:22 and ending 2013-01-18 13:50:02. pdftribute.json.gz contains 42,277 tweets, for the period starting with Jan 13 02:42:26 and ending Jan 17 03:33:46. In addition, the URLs...
Topic: aaronsw
Wikimedia miscellaneous files
Sep 18, 2012
web
eye 197
favorite 0
comment 0
wikitweets is a collection of Twitter messages that reference Wikipedia. They are collected by wikitweets, whose code is available at GitHub.
The Archive Team Just In Time Grabs
Jul 17, 2012
web
eye 2,707
favorite 0
comment 0
The Kasabi data publishing platform created by Talis was announced to be closing on July 30, 2012. While the service has only been around for ~2 years, it represents a unique look at services for Linked Data, and contains a variety of datasets. In a subsequent post Kasabi announced the availability of a spreadsheet that lists where datasets can be downloaded from Amazon S3. This spreadsheet has been uploaded to Internet Archive as datasets.csv, and each referenced dataset has been uploaded as...
Topics: data, rdf, linked data